-
-
Notifications
You must be signed in to change notification settings - Fork 828
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for telemetry metrics and tracing #1715
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left you some thoughts, take it or leave it. I didn't see anything that really is a problem! It's AMAZING how you've gone from 0 to hero in metrics gathering.
Once you have a (alpha|beta) nuget version, I'll try to create a Prometheus dashboard using my old POC project.
static TagList GetTags (IPAddress ip, string host, int port, Exception ex = null) | ||
{ | ||
var tags = new TagList { | ||
{ "network.peer.address", ip.ToString () }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you should really avoid putting IPs, addresses and ports as tags in metrics. for use cases where you connect to few target servers it would be fine. however, if the target IP/host could change this could potentially cause the backing time-series database (like prometheus or influxDB) to operate poorly due to high cardinality tags, I've actually crashed our influxDB instance this way 🤣.
maybe, if the community wants such things you could add a config switch to add those. on the other hand these values could be probably filtered on export and before ingestion into the timeseries db.
your call :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I didn't think it'd be that sensitive. But yea, the network.peer.address
is probably not all that useful compared to server.address
(i.e. hostname) and that's probably good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm also kinda wondering if SocketMetrics should just go away since it probably doesn't add a whole lot of value other than the error values.
Perhaps this would be the kind of thing that would be better as log info rather than metrics.
public void RecordClientDisconnected (long startTimestamp, Uri uri, Exception ex = null) | ||
{ | ||
if (ConnectionDuration.Enabled) { | ||
var duration = TimeSpan.FromTicks (Stopwatch.GetTimestamp () - startTimestamp).TotalSeconds; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most often than not, recording time better done in milliseconds rather than seconds. It'll also be consistent with your OperationDuration
metric units.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I got this from HttpClient using seconds for connection_duration and was thinking that clients would realistically be kept connected for long enough durations that ms
wouldn't make sense.
For IMAP, this would realistically be the case but perhaps for something like SmtpClient, it isn't necessarily true unless the developer is trying to keep a connection alive for re-use. Not sure how people are using MailKit's SmtpClient in server applications, so this may or may not be the case. System.Net.Mail's SmtpClient was much more like HttpClient in that it kept a connection pool in the background. MailKit, however, is a 1:1 mapping of Client to connection and developers might spawn up a new SmtpClient for every batch send that they do in which case ms
would probably be a more useful metric.
Gotta think about the units for this, I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, OpenTelemetry and Prometheus standardize on seconds unit for duration. Cf https://opentelemetry.io/docs/specs/semconv/general/metrics/#instrument-units and https://prometheus.io/docs/practices/naming/#metric-names
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it could be either, but it should be:
- consistent between multiple metrics from the same source, be predictable
- should be sensitive enough to show time for short processes and long processes while still being consistent
so either should be fine, just have to pick 1, document it and stick to it :)
{ | ||
var tags = new TagList { | ||
{ "url.scheme", uri.Scheme }, | ||
{ "server.address", uri.Host }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though HttpClient
reports those tag values, you should also consider the best practices published by microsoft, specifically:
Beware of having very large or unbounded combinations of tag values being recorded in practice...Most metric collection tools will either drop data to stay within technical limits or there can be large monetary costs to cover the data storage and processing...Most metric collection tools will either drop data to stay within technical limits or there can be large monetary costs to cover the data storage and processing.
In some use cases MailKit could be used to relay emails to many servers (think about a multi-tenant scenario).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. I can likely make this configurable.
this.activity = activity; | ||
this.metrics = metrics; | ||
|
||
if (activity is not null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you might be better off with Activity.Current
then passing it down into the constructor
https://learn.microsoft.com/en-us/dotnet/api/system.diagnostics.activity.current?view=net-6.0
I'm not very fond of dynamic static (too much thread awareness for my personal taste), but it might be fine to avoid passing down the activity.
/// <remarks> | ||
/// Telemetry constants for MailKit. | ||
/// </remarks> | ||
public static class Telemetry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would suggest adding another nuget MailKit.Extensions.Hosting
(following a common convention https://www.nuget.org/packages?q=extensions.hosting)
the nuget could add multiple extensions methods on top of IServiceCollection
like:
AddMailKitSmtp(config => {})
,AddMailKitPop3(config => {})
,AddMailKitImap(config => {})
AddMailKitSmtp("name", config => {})
,AddMailKitPop3("name", config => {})
,AddMailKitImap("name", config => {})
AddMailKitMetrics(config => {})
- this will add metrics reporting for mailkitAddMailKitTracing(config => {})
- this will add tracing reporting for mailkit
it might be a bit out of scope, but it would probably be a natural continuation of this project 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I was planning to do something like this. Just wasn't sure what dependency was needed (now I know!)
Wow AWESOME :) |
No description provided.